gpu: Support custom GPU counter groups #5400

Open: dreveman wants to merge 1 commit into main from dev/reveman/gpu-groups

dreveman (Collaborator) commented Apr 3, 2026

Extend GpuCounterDescriptor with a new GpuCounterGroupSpec message
that allows GPU counter producers to define custom counter groups.
This complements the existing fixed GpuCounterGroup enum with
producer-specific grouping.

Changes:

  • Add GpuCounterGroupSpec to gpu_counter_descriptor.proto (field 6),
    100% backwards compatible with existing protos.
  • Extend gpu_counter_group table with name and description columns.
    Legacy enum-based rows have these as NULL.
  • Parse counter_groups in gpu_event_parser for both legacy inline
    and interned descriptor paths.
  • UI GPU plugin queries custom groups and creates sub-groups under
    Counters, e.g.:
    GPU > Counters > Compute Core > Counter A
  • Document counter groups in docs/data-sources/gpu.md.
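
As a rough illustration of the UI grouping described in the list above, here is a minimal TypeScript sketch. It is not the code from this PR: the `CounterGroupRow` shape, its field names, and the `buildCounterPaths` helper are hypothetical, and placing counters whose group name is NULL directly under Counters is an assumption made only for this example.

```typescript
// Hypothetical row shape: one row per (counter, group) pair.
// The field names are illustrative, not the trace processor schema.
interface CounterGroupRow {
  counterName: string;
  // null stands in for legacy enum-based rows, whose name column is NULL.
  groupName: string | null;
}

// Builds UI paths such as "GPU > Counters > Compute Core > Counter A".
// Counters without a named group are placed directly under Counters
// (an assumption made for this sketch).
function buildCounterPaths(rows: CounterGroupRow[]): string[] {
  const root: string[] = ['GPU', 'Counters'];
  const paths: string[] = [];
  for (const row of rows) {
    const segments: string[] =
      row.groupName === null
        ? [...root, row.counterName]
        : [...root, row.groupName, row.counterName];
    paths.push(segments.join(' > '));
  }
  return paths;
}

const examplePaths: string[] = buildCounterPaths([
  {counterName: 'Counter A', groupName: 'Compute Core'},
  {counterName: 'Counter B', groupName: null},
]);
console.log(examplePaths.join('\n'));
```

The sketch keeps a single level of grouping: each counter gets at most one group segment between Counters and its own name.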

dreveman requested a review from LalitMaganti April 3, 2026 18:44
dreveman requested a review from a team as a code owner April 3, 2026 18:44

LalitMaganti (Member) commented Apr 3, 2026

@dreveman really appreciate how much stuff has been improving here, but I think we'll need to pause reviews for a bit, as this CL is non-trivial and I need to think about the implications of extending the concept of parent/child nested tracks outside of track event (this is the first use case of that).

I need to work on some internal facing things for a bit and won't be able to get to this for at least a week.

dreveman (Collaborator, Author) commented Apr 3, 2026

> @dreveman really appreciate how much stuff has been improving here, but I think we'll need to pause reviews for a bit, as this CL is non-trivial and I need to think about the implications of extending the concept of parent/child nested tracks outside of track event (this is the first use case of that).

I was debating this myself. Would a single level of grouping be enough, if that helps? We already have that today; it is just not implemented in the UI, and it is useless for GPGPU workloads with the current predefined categories because basically everything falls into the COMPUTE category.

> I need to work on some internal facing things for a bit and won't be able to get to this for at least a week.

No problem! I appreciate all the fast reviews and totally understand that you might have higher priority things to attend to.

FYI, almost everything I planned has been merged. The only remaining pieces are this group improvement and the correlation/flow work, for which I created another PR but paused work. Everything else needed for good GPGPU support is done as new UI plugins, which can be upstreamed but can also continue to be maintained externally.

dreveman force-pushed the dev/reveman/gpu-groups branch from eaa8298 to 03ea3f9 April 4, 2026 13:29
dreveman changed the title from "ui: Support custom nested GPU counter groups" to "gpu: Support custom GPU counter groups" Apr 4, 2026

dreveman (Collaborator, Author) commented Apr 5, 2026

@LalitMaganti No rush to review this. Just FYI that I removed support for nesting from the latest version, as it's not clear that it's needed. One level of grouping goes a long way.

Alternatively, we could use the existing GpuCounterBlock proto instead of introducing GpuCounterGroupSpec. But given that groups are already described as a way to organize counters in the UI, and the block system was not added to serve that purpose, that seemed like the wrong approach to me.

LalitMaganti (Member) commented

That definitely makes it less controversial, but I still need to think about this when my plate is a bit clearer.

dreveman force-pushed the dev/reveman/gpu-groups branch 2 times, most recently from 2805b58 to 9b7af28 April 8, 2026 23:23